2024-07-17 13:47:02.AIbase.10.3k
智源AI has unveiled EVE, its latest generation encoder-less vision-language multimodal large model.
Recently, significant progress has been made in the research and application of multimodal large models. Foreign companies such as OpenAI, Google, and Microsoft have introduced a series of advanced models, and domestic institutions such as Zhipu AI and Jietu Xingchen have also made breakthroughs in this field. These models typically rely on visual encoders to extract visual features and integrate them with large language models, but they face issues such as visual inductive bias due to training